roachtest: move validate-after-version-upgrade to new framework #113643
I'm wondering -- could this test be written as a logictest instead? It seems like the kind of check that doesn't really need all the machinery of a real cluster. If that's not an option, another alternative I'd suggest considering is to piggyback on |
I didn't see a way to write this as a logictest when I considered it. The issue is that logictests don't have a way to wipe the cluster. This test starts a cluster on the latest version to get the system tables, then wipes the cluster, then starts it on the predecessor version and upgrades it. That is actually somewhat of a "complex" operation, so I think it does make sense as a roachtest, which provides that flexibility. Does the version-upgrade test you mentioned do something similar? |
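For illustration, the final comparison step of the flow described above (schema from a freshly bootstrapped cluster versus schema from a cluster upgraded from the predecessor version) can be sketched as a plain map diff. This is only a minimal sketch: the `schema` type, the `diffSchemas` helper, and the table definitions are hypothetical stand-ins, not the helpers or system tables the roachtest actually uses.

```go
package main

import (
	"fmt"
	"sort"
)

// schema maps a table name to its CREATE statement.
// Hypothetical stand-in for whatever the real test collects.
type schema map[string]string

// diffSchemas returns a sorted list of differences between the schema of a
// freshly bootstrapped cluster (expected) and the schema of a cluster that
// was started on the predecessor version and then upgraded (actual).
func diffSchemas(expected, actual schema) []string {
	var diffs []string
	for name, want := range expected {
		got, ok := actual[name]
		switch {
		case !ok:
			diffs = append(diffs, fmt.Sprintf("missing after upgrade: %s", name))
		case got != want:
			diffs = append(diffs, fmt.Sprintf("definition differs: %s", name))
		}
	}
	for name := range actual {
		if _, ok := expected[name]; !ok {
			diffs = append(diffs, fmt.Sprintf("unexpected after upgrade: %s", name))
		}
	}
	// Map iteration order is random, so sort for a stable report.
	sort.Strings(diffs)
	return diffs
}

func main() {
	// Made-up definitions, used only to exercise the diff.
	bootstrapped := schema{
		"system.jobs":  "CREATE TABLE system.jobs (id INT8, status STRING)",
		"system.users": "CREATE TABLE system.users (username STRING)",
	}
	upgraded := schema{
		"system.jobs":  "CREATE TABLE system.jobs (id INT8)",
		"system.users": "CREATE TABLE system.users (username STRING)",
	}
	for _, d := range diffSchemas(bootstrapped, upgraded) {
		fmt.Println(d)
	}
}
```

An empty result means the upgraded cluster converged on the same definitions as a fresh bootstrap; any entry points at a migration that left a table in a different shape.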
bfd9e9e to 3a30edb
Much cleaner! Thanks!
My understanding of the |
That's true, but it also sounds like it wouldn't be too bad to implement (wipe is just Another alternative to consider: how long does it take to run this test in |
We would just need to have the exact same logic to load the |
// Compare the results.
validateEquivalenceStep(&expected, &actual),
binary := uploadVersion(ctx, t, c, c.All(), clusterupgrade.CurrentVersion())
We can skip this step and use install.BinaryOption(test.DefaultCockroachPath) below.
done
expected = obtainSystemSchema(ctx, t.L(), c, 1)
c.Wipe(ctx, false /* preserveCerts */, c.All())

mvt := mixedversion.NewTest(ctx, t, t.L(), c, c.All(), mixedversion.NeverUseFixtures)
- Is this test incompatible with fixtures?
- I believe you'll want mixedversion.NumUpgrades(1) here to ensure we only perform one upgrade in this test. Otherwise it might choose to perform several upgrades in the same test: it could go through 22.2 -> 23.1 -> master. AfterUpgradeFinalized is called after each upgrade.
Is this test incompatible with fixtures?
According to this comment, yes.
// NeverUseFixtures is an option that can be passed to `NewTest` to
// disable the use of fixtures in the test. Necessary if the test
// wants to use a number of cockroach nodes other than 4.
I believe you'll want mixedversion.NumUpgrades(1) here to ensure we only perform one upgrade in this test. Otherwise it might choose to perform several upgrades in the same test: it could go through 22.2 -> 23.1 -> master. AfterUpgradeFinalized is called after each upgrade.
Ah nice, thanks for the pointer.
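To make the point about limiting the number of upgrades concrete, here is a toy model of how an option like that can shorten the upgrade path and, with it, the number of times a per-upgrade hook fires. Everything in it (`testOptions`, `numUpgrades`, `planUpgrades`) is hypothetical and only mirrors the functional-options shape; it is not the mixedversion package, which also chooses its path randomly rather than deterministically.

```go
package main

import "fmt"

// testOptions is a hypothetical option bag; not the real mixedversion type.
type testOptions struct {
	maxUpgrades int // 0 means "no limit"
}

// option mutates testOptions, following the functional-options pattern.
type option func(*testOptions)

// numUpgrades caps how many upgrades the plan performs (hypothetical).
func numUpgrades(n int) option {
	return func(o *testOptions) { o.maxUpgrades = n }
}

// planUpgrades returns the versions the test walks through, always ending at
// the last entry of releases. Without a cap it uses every release; with a
// cap it starts as late as needed to perform at most that many upgrades.
func planUpgrades(releases []string, opts ...option) []string {
	var o testOptions
	for _, opt := range opts {
		opt(&o)
	}
	steps := len(releases) - 1
	if o.maxUpgrades > 0 && o.maxUpgrades < steps {
		steps = o.maxUpgrades
	}
	return releases[len(releases)-1-steps:]
}

func main() {
	releases := []string{"22.2", "23.1", "master"}

	unbounded := planUpgrades(releases)
	single := planUpgrades(releases, numUpgrades(1))

	// A hook that runs after each finalized upgrade fires once per hop,
	// that is, len(path)-1 times.
	fmt.Println(unbounded, "hook calls:", len(unbounded)-1)
	fmt.Println(single, "hook calls:", len(single)-1)
}
```

In this model the uncapped plan runs the hook twice, so a schema comparison placed in that hook would also run against the intermediate version; capping at one upgrade leaves a single comparison against the final version.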
Hm but it requires a few changes to the cockroach-go/testserver API. I don't have the spare cycles to implement that myself, but I'm not opposed if you want to try adding that. Additionally, the mixed version logictest framework doesn't have a way to tell it to start on a specific version, and adding that could make the directives/syntax a bit complicated.
Yeah a local version does make sense. I will work on that. |
477cdec to 9c7f2ef
9c7f2ef to a34dba9
I was running this test on my gceworker and it failed, and it seems to be failing on TC as well. Code looks good to me, not sure if it's a real bug or something is wrong in the test setup. |
I see this failure too, and I am surprised by it. A diff like this makes it seem like something is really off.
I will debug further. |
It looks like the reason it's failing is due to issues with system schema migrations that predate this test. One example is dd7cbf2. In that commit, the
But the confusing thing is that dd7cbf2 is added as part of the upgrade to v22.1.0. According to the test logs, this test started on v23.1.11... so I'm extremely confused why we'd see a v22.1.0 migration being executed. |
Not sure if it helps, but the cluster was bootstrapped on 22.2:
|
Where can I find that? I don't see it in the artifacts at |
Ah, I was looking at the |
i see, well regardless, if we want this test to be part of |
Maybe the problem is that these tests are bootstrapping from fixtures, and the fixtures were created before the fixes you mentioned were merged? Yes, given what we learned I think we should not be using
|
a34dba9 to 09b5d1b
I think this test should only run in local mode, but I'll address that together with a few other tests in follow-up work. Thanks for iterating on the approach!
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @annrpom and @herkolategan)
thanks for the reviews! bors r+ |
Build failed: |
{
	name: "validate-system-schema-after-version-upgrade",
	fn: runValidateSystemSchemaAfterVersionUpgrade,
	timeout: 30 * time.Minute,
I think we need defaultLeases: true here.
done
The new framework provides a few more testing enhancements and is the only one that will be maintained. It also adds the test to the suite of local acceptance tests that are run in CI. Release note: None
09b5d1b to e76ddd3
bors r+ |
bors r+ |
Build succeeded: |
blathers backport 23.1 |
Encountered an error creating backports. Some common things that can go wrong:
You might need to create your backport manually using the backport tool. error creating merge commit from e76ddd3 to blathers/backport-release-23.1-113643: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict [] you may need to manually resolve merge conflicts with the backport tool. Backport to branch 23.1 failed. See errors above. 🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf. |
We should probably backport to 23.1 too. |
The new framework provides a few more testing enhancements and is the only one that will be maintained.
fixes #110535
Release note: None